



[D] seq2seq why use cross entropy loss? • r/MachineLearning

@machinelearnbot

If we use word embeddings in our seq2seq model, why don't we just use the distance between two vectors as the loss function instead of softmax cross entropy?
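
To make the two options in the question concrete, here is a minimal NumPy sketch that computes both losses for a single decoder step. Everything in it is an assumption made for illustration: the tiny random embedding table, the `hidden` vector standing in for the decoder output, the choice of squared Euclidean distance as "the distance", and the use of the embedding table itself as the output projection. It is not taken from any particular seq2seq implementation.

```python
import numpy as np


def softmax_cross_entropy(logits, target_index):
    """Negative log-probability of the target word under softmax(logits)."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[target_index])


def embedding_distance_loss(predicted_vector, target_vector):
    """Squared Euclidean distance between a predicted and a target embedding."""
    diff = predicted_vector - target_vector
    return float(np.dot(diff, diff))


# Hypothetical toy setup: 5-word vocabulary, 3-dimensional embeddings.
rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3
embeddings = rng.normal(size=(vocab_size, embed_dim))  # assumed embedding table
hidden = rng.normal(size=embed_dim)  # stand-in for the decoder output at one step
target_index = 2  # index of the correct next word

# Option 1: score every word in the vocabulary, then apply softmax cross entropy.
# Reusing the embedding table as the output projection is an assumption.
logits = embeddings @ hidden
ce_loss = softmax_cross_entropy(logits, target_index)

# Option 2: treat the decoder output as a predicted embedding and measure
# its distance to the target word's embedding.
dist_loss = embedding_distance_loss(hidden, embeddings[target_index])

print("cross entropy loss:", ce_loss)
print("distance loss:", dist_loss)
```

Note the structural difference between the two: the cross entropy version looks at scores for every word in the vocabulary, while the distance version only compares the output against the single target embedding.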